Goto

Collaborating Authors

 anomaly type


PANDA: Towards Generalist Video Anomaly Detection via Agentic AI Engineer

Neural Information Processing Systems

Video anomaly detection (VAD) is a critical yet challenging task due to the complex and diverse nature of real-world scenarios. Previous methods typically rely on domain-specific training data and manual adjustments when applying to new scenarios and unseen anomaly types, suffering from high labor costs and limited generalization. Therefore, we aim to achieve generalist VAD, \ie, automatically handle any scene and any anomaly types without training data or human involvement. In this work, we propose PANDA, an agentic AI engineer based on MLLMs. Specifically, we achieve PANDA by comprehensively devising four key capabilities: (1) self-adaptive scene-aware strategy planning, (2) goal-driven heuristic reasoning, (3) tool-augmented self-reflection, and (4) self-improving chain-of-memory.



CableInspect-AD: An Expert-Annotated Anomaly Detection Dataset

Neural Information Processing Systems

Machine learning models are increasingly being deployed in real-world contexts. However, systematic studies on their transferability to specific and critical applications are underrepresented in the research literature. An important example is visual anomaly detection (V AD) for robotic power line inspection.



Pistachio: Towards Synthetic, Balanced, and Long-Form Video Anomaly Benchmarks

arXiv.org Artificial Intelligence

Automatically detecting abnormal events in videos is crucial for modern autonomous systems, yet existing Video Anomaly Detection (VAD) benchmarks lack the scene diversity, balanced anomaly coverage, and temporal complexity needed to reliably assess real-world performance. Meanwhile, the community is increasingly moving toward Video Anomaly Understanding (VAU), which requires deeper semantic and causal reasoning but remains difficult to benchmark due to the heavy manual annotation effort it demands. In this paper, we introduce Pistachio, a new VAD/VAU benchmark constructed entirely through a controlled, generation-based pipeline. By leveraging recent advances in video generation models, Pistachio provides precise control over scenes, anomaly types, and temporal narratives, effectively eliminating the biases and limitations of Internet-collected datasets. Our pipeline integrates scene-conditioned anomaly assignment, multi-step storyline generation, and a temporally consistent long-form synthesis strategy that produces coherent 41-second videos with minimal human intervention. Extensive experiments demonstrate the scale, diversity, and complexity of Pistachio, revealing new challenges for existing methods and motivating future research on dynamic and multi-event anomaly understanding.



CableInspect-AD: An Expert-Annotated Anomaly Detection Dataset

Neural Information Processing Systems

Machine learning models are increasingly being deployed in real-world contexts. However, systematic studies on their transferability to specific and critical applications are underrepresented in the research literature. An important example is visual anomaly detection (V AD) for robotic power line inspection.


Scalable, Technology-Agnostic Diagnosis and Predictive Maintenance for Point Machine using Deep Learning

arXiv.org Artificial Intelligence

The Point Machine (PM) is a critical piece of railway equipment that switches train routes by diverting tracks through a switchblade. As with any critical safety equipment, a failure will halt operations leading to service disruptions; therefore, pre-emptive maintenance may avoid unnecessary interruptions by detecting anomalies before they become failures. Previous work relies on several inputs and crafting custom features by segmenting the signal. This not only adds additional requirements for data collection and processing, but it is also specific to the PM technology, the installed locations and operational conditions limiting scalability. Based on the available maintenance records, the main failure causes for PM are obstacles, friction, power source issues and misalignment. Those failures affect the energy consumption pattern of PMs, altering the usual (or healthy) shape of the power signal during the PM movement. In contrast to the current state-of-the-art, our method requires only one input. We apply a deep learning model to the power signal pattern to classify if the PM is nominal or associated with any failure type, achieving >99.99\% precision, <0.01\% false positives and negligible false negatives. Our methodology is generic and technology-agnostic, proven to be scalable on several electromechanical PM types deployed in both real-world and test bench environments. Finally, by using conformal prediction the maintainer gets a clear indication of the certainty of the system outputs, adding a confidence layer to operations and making the method compliant with the ISO-17359 standard.



TAB: Unified Benchmarking of Time Series Anomaly Detection Methods

arXiv.org Artificial Intelligence

Time series anomaly detection (TSAD) plays an important role in many domains such as finance, transportation, and healthcare. With the ongoing instrumentation of reality, more time series data will be available, leading also to growing demands for TSAD. While many TSAD methods already exist, new and better methods are still desirable. However, effective progress hinges on the availability of reliable means of evaluating new methods and comparing them with existing methods. We address deficiencies in current evaluation procedures related to datasets and experimental settings and protocols. Specifically, we propose a new time series anomaly detection benchmark, called TAB. First, TAB encompasses 29 public multivariate datasets and 1,635 univariate time series from different domains to facilitate more comprehensive evaluations on diverse datasets. Second, TAB covers a variety of TSAD methods, including Non-learning, Machine learning, Deep learning, LLM-based, and Time-series pre-trained methods. Third, TAB features a unified and automated evaluation pipeline that enables fair and easy evaluation of TSAD methods. Finally, we employ TAB to evaluate existing TSAD methods and report on the outcomes, thereby offering a deeper insight into the performance of these methods. Besides, all datasets and code are available at https://github.com/decisionintelligence/TAB.